Overview

Dataset Statistics

Number of Variables 18
Number of Rows 11762
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 9.7 MB
Average Row Size in Memory 861.7 B
Variable Types
  • Categorical: 15
  • Geography: 1
  • Numerical: 2
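
The size and type figures above can be cross-checked against each other. The following is a minimal sketch, assuming the report uses binary megabytes (1 MB = 1024² bytes); that unit convention is an assumption, not something the report states.

```python
# Figures copied from the Dataset Statistics block above.
n_rows = 11762
avg_row_bytes = 861.7
type_counts = [15, 1, 2]  # categorical, geography, numerical

# Assumption: "MB" means 1024**2 bytes (binary megabytes).
total_mb = n_rows * avg_row_bytes / 1024**2

print(f"estimated total size: {total_mb:.1f} MB")
print(f"variables accounted for: {sum(type_counts)}")
```

Under that assumption the estimate agrees with the reported 9.7 MB, and the three type counts add up to the 18 variables listed.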

Dataset Insights

  • household_size is skewed (Skewed)
  • uniqueid has a high cardinality: 6916 distinct values (High Cardinality)
  • uid has a high cardinality: 11762 distinct values (High Cardinality)
  • year has constant length 4 (Constant Length)
  • location_type has constant length 5 (Constant Length)
  • rural_cellphone_access has constant length 1 (Constant Length)
  • urban_cellphone_access has constant length 1 (Constant Length)
  • provider has constant length 1 (Constant Length)
  • uid has all distinct values (Unique)
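
The Constant Length and Unique insights above can be illustrated with two small checks. This is a sketch only: the helper names `constant_length` and `all_distinct` are hypothetical, and the inputs are the five sample rows shown later in the report, not the full columns.

```python
def constant_length(values):
    """Return the shared string length if every value has it, else None."""
    lengths = {len(str(v)) for v in values}
    return lengths.pop() if len(lengths) == 1 else None


def all_distinct(values):
    """Return True when no value repeats."""
    values = list(values)
    return len(set(values)) == len(values)


# First five rows from the report's Sample blocks (not the full dataset).
year_sample = ["2016", "2017", "2016", "2016", "2017"]
uniqueid_sample = [
    "uniqueid_4858", "uniqueid_3015", "uniqueid_103",
    "uniqueid_4582", "uniqueid_2854",
]

print(constant_length(year_sample))      # every sampled year has the same length
print(constant_length(uniqueid_sample))  # lengths differ, so no constant length
print(all_distinct(uniqueid_sample))     # no repeats within the sample
```

The thresholds behind the High Cardinality and Skewed insights are not given in the report, so they are left out of the sketch.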

Variables


uniqueid

categorical

Approximate Distinct Count 6916
Approximate Unique (%) 58.8%
Missing 0
Missing (%) 0.0%
Memory Size 893.8 KB

Length

Mean 12.8101
Standard Deviation 0.4368
Median 13
Minimum 10
Maximum 13

Sample

1st row uniqueid_4858
2nd row uniqueid_3015
3rd row uniqueid_103
4th row uniqueid_4582
5th row uniqueid_2854

Letter

Count 94096
Lowercase Letter 94096
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 44814
  • uniqueid contains many words: 6916 words
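
The labels in the Letter block match Unicode general category names (Lowercase Letter = Ll, Uppercase Letter = Lu, Space Separator = Zs, Dash Punctuation = Pd, Decimal Number = Nd). The sketch below counts those categories with Python's standard `unicodedata` module over the five sample rows only; that the report was produced the same way is an assumption.

```python
import unicodedata
from collections import Counter

# First five uniqueid values from the Sample block (not the full column).
sample = [
    "uniqueid_4858", "uniqueid_3015", "uniqueid_103",
    "uniqueid_4582", "uniqueid_2854",
]

# Tally the Unicode general category of every character.
counts = Counter(unicodedata.category(ch) for value in sample for ch in value)

labels = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}
for code, label in labels.items():
    print(f"{label}: {counts[code]}")
```

The underscore belongs to the Connector Punctuation category (Pc), which the report does not list. That is consistent with Dash Punctuation being 0 even though every value contains an underscore.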

uid

categorical

Approximate Distinct Count 11762
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory Size 977.6 KB

Length

Mean 20.1097
Standard Deviation 1.2277
Median 20
Minimum 16
Maximum 22

Sample

1st row Rwanda_uniqueid_48...
2nd row Tanzania_uniqueid_...
3rd row Rwanda_uniqueid_10...
4th row Rwanda_uniqueid_45...
5th row Tanzania_uniqueid_...

Letter

Count 168192
Lowercase Letter 156430
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 44814
  • uid contains many words: 11762 words
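
The sample rows suggest that uid is the country name, an underscore, and the uniqueid joined together. The sketch below checks the reported uid figures against that reading. The composition rule is an inference from the samples, not something the report states.

```python
# Figures copied from the country, uniqueid and uid blocks of the report.
country = {"mean_len": 6.2996, "letters": 74096, "upper": 11762, "digits": 0}
uniqueid = {"mean_len": 12.8101, "letters": 94096, "upper": 0, "digits": 44814}
uid_reported = {"mean_len": 20.1097, "letters": 168192, "upper": 11762, "digits": 44814}

# Assumption: uid == country + "_" + uniqueid, so lengths and counts add up,
# with one extra character per row for the joining underscore.
uid_expected = {
    "mean_len": country["mean_len"] + 1 + uniqueid["mean_len"],
    "letters": country["letters"] + uniqueid["letters"],
    "upper": country["upper"] + uniqueid["upper"],
    "digits": country["digits"] + uniqueid["digits"],
}

for key, reported in uid_reported.items():
    print(f"{key}: expected {uid_expected[key]:.4f}, reported {reported:.4f}")
```

All four reported uid figures agree with the sums, which supports the reading that uid carries no information beyond country and uniqueid.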

country

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 819.0 KB

Length

Mean 6.2996
Standard Deviation 1.1349
Median 6
Minimum 5
Maximum 8

Sample

1st row Rwanda
2nd row Tanzania
3rd row Rwanda
4th row Rwanda
5th row Tanzania

Letter

Count 74096
Lowercase Letter 62334
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Rwanda, Tanzania) account for over 50.0% of values
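
A "top 2 categories" share like the one above can be computed from value counts. This is a minimal sketch with a hypothetical helper `top_k_share`, run on the five sample rows only, so its share is not the dataset's.

```python
from collections import Counter


def top_k_share(values, k=2):
    """Return the k most frequent values and their combined share of all values."""
    counts = Counter(values)
    top = counts.most_common(k)
    names = [name for name, _ in top]
    share = sum(count for _, count in top) / len(values)
    return names, share


# First five country values from the Sample block (not the full column).
country_sample = ["Rwanda", "Tanzania", "Rwanda", "Rwanda", "Tanzania"]

names, share = top_k_share(country_sample)
print(names, f"{share:.1%}")
```

On the full column the same calculation would run over all 11762 rows and 4 distinct countries.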

year

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 792.6 KB

Length

Mean 4
Standard Deviation 0
Median 4
Minimum 4
Maximum 4

Sample

1st row 2016
2nd row 2017
3rd row 2016
4th row 2016
5th row 2017

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 47048
  • The top 2 categories (2016, 2018) account for over 50.0% of values
  • year has words of constant length

gender_of_respondent

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 806.0 KB

Length

Mean 5.1736
Standard Deviation 0.9849
Median 6
Minimum 4
Maximum 6

Sample

1st row Male
2nd row Female
3rd row Male
4th row Female
5th row Male

Letter

Count 60852
Lowercase Letter 49090
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Female, Male) account for over 50.0% of values

age_of_respondent

numerical

Approximate Distinct Count 82
Approximate Unique (%) 0.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 183.8 KB
Mean 38.6024
Minimum 16
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • age_of_respondent is skewed right (γ1 = 0.8516)

Quantile Statistics

Minimum 16
5th Percentile 18
Q1 26
Median 35
Q3 48
95th Percentile 71
Maximum 100
Range 84
IQR 22

Descriptive Statistics

Mean 38.6024
Standard Deviation 16.3346
Variance 266.8199
Sum 454041
Skewness 0.8516
Kurtosis 0.1613
Coefficient of Variation 0.4232
  • age_of_respondent has 151 outliers
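
Several of the age_of_respondent statistics are derived from one another, so they can be checked for internal consistency. The sketch below recomputes them from the reported values. The outlier fences use the common 1.5 × IQR rule, which is an assumption: the report does not say how it counts outliers.

```python
# Figures copied from the age_of_respondent blocks above.
n_rows = 11762
minimum, maximum = 16, 100
q1, q3 = 26, 48
mean, std = 38.6024, 16.3346

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported IQR
variance = std ** 2               # reported Variance (up to rounding)
cv = std / mean                   # reported Coefficient of Variation
total = mean * n_rows             # reported Sum (up to rounding)

# Assumption: outliers are values outside the Tukey fences Q1 - 1.5*IQR and Q3 + 1.5*IQR.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"range={value_range}, iqr={iqr}")
print(f"variance={variance:.4f}, cv={cv:.4f}, sum={total:.0f}")
print(f"fences=({lower_fence}, {upper_fence})")
```

Small differences in the last decimal place against the report are expected, because the inputs here are already rounded to four places. Under the 1.5 × IQR assumption the lower fence lies below the minimum age, so any outliers would sit on the high side.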

location_type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 804.0 KB
  • The most frequent value (Rural) occurs over 1.52 times as often as the second most frequent value (Urban)

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row Rural
2nd row Urban
3rd row Rural
4th row Rural
5th row Urban

Letter

Count 58810
Lowercase Letter 47048
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Rural, Urban) account for over 50.0% of values
  • The most frequent word (rural) occurs over 1.52 times as often as the second most frequent word (urban)
  • location_type has words of constant length

job_type

categorical

Approximate Distinct Count 10
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 945.7 KB

Length

Mean 17.3339
Standard Deviation 3.9245
Median 19
Minimum 9
Maximum 28

Sample

1st row Farming and Fishin...
2nd row Self employed
3rd row Farming and Fishin...
4th row Farming and Fishin...
5th row Informally employe...

Letter

Count 188490
Lowercase Letter 170902
Space Separator 15332
Uppercase Letter 17588
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Self employed, Informally employed) account for over 50.0% of values
  • The most frequent word (employed) occurs over 2.09 times as often as the second most frequent word (self)

education_level

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 956.5 KB
  • The most frequent value (Primary education) occurs over 2.86 times as often as the second most frequent value (No formal education)

Length

Mean 18.2693
Standard Deviation 2.5649
Median 17
Minimum 17
Maximum 31

Sample

1st row Primary education
2nd row Primary education
3rd row Secondary educatio...
4th row Primary education
5th row Primary education

Letter

Count 200443
Lowercase Letter 188202
Space Separator 14002
Uppercase Letter 12241
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Primary education, No formal education) account for over 50.0% of values
  • The most frequent word (education) occurs over 1.77 times as often as the second most frequent word (primary)

relationship_with_head

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 886.8 KB
  • The most frequent value (Head of Household) occurs over 1.93 times as often as the second most frequent value (Spouse)

Length

Mean 12.2024
Standard Deviation 5.5071
Median 17
Minimum 5
Maximum 19

Sample

1st row Head of Household
2nd row Head of Household
3rd row Head of Household
4th row Head of Household
5th row Head of Household

Letter

Count 130261
Lowercase Letter 112141
Space Separator 13174
Uppercase Letter 18120
Dash Punctuation 90
Decimal Number 0
  • The top 2 categories (Head of Household, Spouse) account for over 50.0% of values

marital_status

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 973.4 KB

Length

Mean 19.7461
Standard Deviation 4.8444
Median 20
Minimum 7
Maximum 23

Sample

1st row Divorced/Seperated
2nd row Single/Never Marri...
3rd row Married/Living tog...
4th row Married/Living tog...
5th row Single/Never Marri...

Letter

Count 212410
Lowercase Letter 186241
Space Separator 9407
Uppercase Letter 26169
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Married/Living together, Single/Never Married) account for over 50.0% of values

household_size

numerical

Approximate Distinct Count 20
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 183.8 KB
Mean 3.7939
Minimum 1
Maximum 21
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • household_size is skewed right (γ1 = 1.004)

Quantile Statistics

Minimum 1
5th Percentile 1
Q1 2
Median 3
Q3 5
95th Percentile 8
Maximum 21
Range 20
IQR 3

Descriptive Statistics

Mean 3.7939
Standard Deviation 2.2254
Variance 4.9525
Sum 44624
Skewness 1.004
Kurtosis 1.5287
Coefficient of Variation 0.5866
  • household_size is not normally distributed (p-value 3.61e-10)
  • household_size has 173 outliers
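
The Skewness (γ1) and Kurtosis figures above are moment-based shape statistics. The sketch below computes the population skewness m3 / m2^1.5 and the excess kurtosis m4 / m2² − 3. That the report uses exactly these estimators is an assumption, and the input lists are hypothetical illustrations, not rows from the dataset.

```python
def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments with divisor n."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical data: a symmetric list and a list with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 9]

sym_skew, sym_kurt = skew_and_excess_kurtosis(symmetric)
tail_skew, tail_kurt = skew_and_excess_kurtosis(right_tailed)
print(f"symmetric:    skew={sym_skew:.4f}, excess kurtosis={sym_kurt:.4f}")
print(f"right-tailed: skew={tail_skew:.4f}, excess kurtosis={tail_kurt:.4f}")
```

A positive skewness, as reported for household_size, indicates a longer tail toward large values, which fits a median of 3 alongside a maximum of 21.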

cellphone_access

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 778.2 KB
  • The most frequent value (Yes) occurs over 2.95 times as often as the second most frequent value (No)

Length

Mean 2.747
Standard Deviation 0.4348
Median 3
Minimum 2
Maximum 3

Sample

1st row Yes
2nd row No
3rd row Yes
4th row No
5th row Yes

Letter

Count 32310
Lowercase Letter 20548
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Yes, No) account for over 50.0% of values
  • The most frequent word (yes) occurs over 2.95 times as often as the second most frequent word (no)
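
For a two-valued column whose values differ in length, the category counts can be recovered from the total letter count, which makes the frequency ratio checkable. The sketch below does this for cellphone_access and bank_account. It assumes every value is exactly "Yes" or "No", so the letter count equals the total character count; `two_category_counts` is a hypothetical helper.

```python
def two_category_counts(total_chars, n_rows, len_a, len_b):
    """Solve a + b = n_rows and len_a*a + len_b*b = total_chars for the counts a, b."""
    a, remainder = divmod(total_chars - len_b * n_rows, len_a - len_b)
    if remainder or not 0 <= a <= n_rows:
        raise ValueError("totals are inconsistent with two fixed-length categories")
    return a, n_rows - a


n_rows = 11762

# Letter counts copied from the cellphone_access and bank_account blocks.
cell_yes, cell_no = two_category_counts(32310, n_rows, len("Yes"), len("No"))
bank_yes, bank_no = two_category_counts(25209, n_rows, len("Yes"), len("No"))

print(f"cellphone_access: Yes={cell_yes}, No={cell_no}, ratio={cell_yes / cell_no:.4f}")
print(f"bank_account:     Yes={bank_yes}, No={bank_no}, ratio={bank_no / bank_yes:.4f}")
```

Both ratios land just above the thresholds quoted in the report (2.95 and 5.98), which confirms that "times larger" there means the ratio of the two counts.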

bank_account

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 771.2 KB
  • The most frequent value (No) occurs over 5.98 times as often as the second most frequent value (Yes)

Length

Mean 2.1433
Standard Deviation 0.3504
Median 2
Minimum 2
Maximum 3

Sample

1st row No
2nd row No
3rd row No
4th row No
5th row No

Letter

Count 25209
Lowercase Letter 13447
Space Separator 0
Uppercase Letter 11762
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (No, Yes) account for over 50.0% of values
  • The most frequent word (no) occurs over 5.98 times as often as the second most frequent word (yes)

rural_cellphone_access

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 758.1 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 0
3rd row 1
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 11762
  • The top 2 categories (0, 1) account for over 50.0% of values
  • rural_cellphone_access has words of constant length

urban_cellphone_access

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 758.1 KB
  • The most frequent value (0) occurs over 2.59 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 11762
  • The top 2 categories (0, 1) account for over 50.0% of values
  • The most frequent word (0) occurs over 2.59 times as often as the second most frequent word (1)
  • urban_cellphone_access has words of constant length
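
Across the five sample rows, rural_cellphone_access and urban_cellphone_access behave like flags derived from location_type and cellphone_access. The sketch below encodes that reading and checks it against the samples. The derivation rule is inferred from five rows only and is not documented in the report, and `access_flags` is a hypothetical helper.

```python
def access_flags(location_type, cellphone_access):
    """Return (rural_flag, urban_flag): 1 only where the respondent has a phone in that area."""
    has_phone = cellphone_access == "Yes"
    rural = int(has_phone and location_type == "Rural")
    urban = int(has_phone and location_type == "Urban")
    return rural, urban


# (location_type, cellphone_access, rural_cellphone_access, urban_cellphone_access)
# taken row by row from the four Sample blocks in the report.
sample_rows = [
    ("Rural", "Yes", 1, 0),
    ("Urban", "No", 0, 0),
    ("Rural", "Yes", 1, 0),
    ("Rural", "No", 0, 0),
    ("Urban", "Yes", 0, 1),
]

matches = [
    access_flags(location, phone) == (rural, urban)
    for location, phone, rural, urban in sample_rows
]
print(f"{sum(matches)} of {len(matches)} sample rows match the inferred rule")
```

If the rule holds for the full dataset, the two flag columns are redundant with location_type and cellphone_access and carry no additional information.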

age

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 836.5 KB

Length

Mean 7.8289
Standard Deviation 2.6401
Median 6
Minimum 5
Maximum 11

Sample

1st row middle_age
2nd row adult
3rd row middle_age
4th row adult
5th row adult

Letter

Count 86444
Lowercase Letter 86444
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (adult, young_adult) account for over 50.0% of values

provider

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 758.1 KB
  • The most frequent value (1) occurs over 4.56 times as often as the second most frequent value (0)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 11762
  • The top 2 categories (1, 0) account for over 50.0% of values
  • The most frequent word (1) occurs over 4.56 times as often as the second most frequent word (0)
  • provider has words of constant length

Interactions

Correlations

Missing Values